AITopics

2601.08964

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

Battiston, Marco, Luo, Yu

An Infinite BART model

arXiv.org Machine LearningNov-26-2025

Bayesian additive regression trees (BART) are popular Bayesian ensemble models used in regression and classification analysis. Under this modeling framework, the regression function is approximated by an ensemble of decision trees, interpreted as weak learners that capture different features of the data. In this work, we propose a generalization of the BART model that has two main features: first, it automatically selects the number of decision trees using the given data; second, the model allows clusters of observations to have different regression functions since each data point can only use a selection of weak learners, instead of all of them. This model generalization is accomplished by including a binary weight matrix in the conditional distribution of the response variable, which activates only a specific subset of decision trees for each observation. Such a matrix is endowed with an Indian Buffet process prior, and sampled within the MCMC sampler, together with the other BART parameters. We then compare the Infinite BART model with the classic one on simulated and real datasets. Specifically, we provide examples illustrating variable importance, partial dependence and causal estimation.

bart, bart model, regression function, (15 more...)

2511.20087

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Alcantara, Rafael, Hahn, P. Richard, Carvalho, Carlos, Lopes, Hedibert

Learning Conditional Average Treatment Effects in Regression Discontinuity Designs using Bayesian Additive Regression Trees

arXiv.org Machine LearningFeb-28-2025

Such designs arise when treatment assignment is based on whether a particular covariate -- referred to as the running variable -- lies above or below a known value, referred to as the cutoff value. Because treatment is deterministically assigned as a known function of the running variable, RDDs are trivially deconfounded: treatment assignment is independent of the outcome variable, given the running variable (because treatment is conditionally constant). However, estimation of treatment effects in RDDs is more complicated than simply controlling for the running variable, because doing so introduces a complete lack of overlap, which is the other key condition needed to justify regression adjustment for causal inference. Nonetheless, treatment effects at the cutoff may still be identified. Specifically, it is well-known that treatment effects at the cutoff can be estimated from RDDs as the magnitude of a discontinuity in the conditional mean response function at that point (Hahn et al., 2001). This paper investigates the use of Bayesian additive regression tree models (Chipman et al., 2010; Hahn et al., 2020) for the purpose of estimating conditional average treatments effects (CATE) at the cutoff, conditional on observed covariates other than the running variable. To the best of our knowledge, such data-driven CATE estimation has not been a focus of the existing RDD literature and we are the first to propose BART for this purpose.

estimation, regression tree, treatment effect, (12 more...)

2503.00326

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

arXiv.org Artificial IntelligenceFeb-11-2025

Bridging Brain Signals and Language: A Deep Learning Approach to EEG-to-Text Decoding

Gedawy, Mostafa El, Nabil, Omnia, Mamdouh, Omar, Nady, Mahmoud, Adel, Nour Alhuda, Fares, Ahmed

Brain activity translation into human language delivers the capability to revolutionize machine-human interaction while providing communication support to people with speech disability. Electronic decoding reaches a certain level of achievement yet current EEG-to-text decoding methods fail to reach open vocabularies and depth of meaning and individual brain-specific variables. We introduce a special framework which changes conventional closed-vocabulary EEG-to-text decoding approaches by integrating subject-specific learning models with natural language processing methods to resolve detection obstacles. This method applies a deep representation learning approach to extract important EEG features which allow training of neural networks to create elaborate sentences that extend beyond original data content. The ZuCo dataset analysis demonstrates that research findings achieve higher BLEU, ROUGE and BERTScore performance when compared to current methods. The research proves how this framework functions as an effective approach to generate meaningful and correct texts while understanding individual brain variations. The proposed research aims to create a connection between open-vocabulary Text generation systems and human brain signal interpretation for developing efficacious brain-to-text systems. The research produces interdisciplinary effects through innovative assistive technology development and personalized communication systems which extend possibilities for human-computer interaction in various settings.

eeg signal, module, representation, (13 more...)

2502.17465

Country:

North America > United States > Florida > Dade County (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Africa > Middle East > Egypt (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Diaf, Alaeddine, Korba, Abdelaziz Amara, Karabadji, Nour Elislem, Ghamri-Doudane, Yacine

BARTPredict: Empowering IoT Security with LLM-Driven Cyber Threat Prediction

arXiv.org Artificial IntelligenceJan-3-2025

The integration of Internet of Things (IoT) technology in various domains has led to operational advancements, but it has also introduced new vulnerabilities to cybersecurity threats, as evidenced by recent widespread cyberattacks on IoT devices. Intrusion detection systems are often reactive, triggered by specific patterns or anomalies observed within the network. To address this challenge, this work proposes a proactive approach to anticipate and preemptively mitigate malicious activities, aiming to prevent potential damage before it occurs. This paper proposes an innovative intrusion prediction framework empowered by Pre-trained Large Language Models (LLMs). The framework incorporates two LLMs: a fine-tuned Bidirectional and AutoRegressive Transformers (BART) model for predicting network traffic and a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for evaluating the predicted traffic. By harnessing the bidirectional capabilities of BART the framework then identifies malicious packets among these predictions. Evaluated using the CICIoT2023 IoT attack dataset, our framework showcases a notable enhancement in predictive performance, attaining an impressive 98% overall accuracy, providing a powerful response to the cybersecurity challenges that confront IoT networks.

large language model, machine learning, natural language, (22 more...)

2501.01664

Country: Africa > Middle East > Algeria (0.15)

Genre:

Research Report (1.00)
Overview (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Goh, Yong Chen, Soh, Wuu Kuang, Parnell, Andrew C., Murphy, Keefe

Joint Models for Handling Non-Ignorable Missing Data using Bayesian Additive Regression Trees: Application to Leaf Photosynthetic Traits Data

arXiv.org Machine LearningDec-19-2024

Dealing with missing data poses significant challenges in predictive analysis, often leading to biased conclusions when oversimplified assumptions about the missing data process are made. In cases where the data are missing not at random (MNAR), jointly modeling the data and missing data indicators is essential. Motivated by a real data application with partially missing multivariate outcomes related to leaf photosynthetic traits and several environmental covariates, we propose two methods under a selection model framework for handling data with missingness in the response variables suitable for recovering various missingness mechanisms. Both approaches use a multivariate extension of Bayesian additive regression trees (BART) to flexibly model the outcomes. The first approach simultaneously uses a probit regression model to jointly model the missingness. In scenarios where the relationship between the missingness and the data is more complex or non-linear, we propose a second approach using a probit BART model to characterize the missing data process, thereby employing two BART models simultaneously. Both models also effectively handle ignorable covariate missingness. The efficacy of both models compared to existing missing data approaches is demonstrated through extensive simulations, in both univariate and multivariate settings, and through the aforementioned application to the leaf photosynthetic trait data.

data quality, machine learning, missingness, (19 more...)

2412.14946

Country:

Europe > Ireland (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)

Genre: Research Report (1.00)

Industry: Materials > Chemicals (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Avram, Andrei-Marius, Timpuriu, Mircea, Iuga, Andreea, Matei, Vlad-Cristian, Tăiatu, Iulian-Marius, Găină, Tudor, Cercel, Dumitru-Clementin, Pop, Florin, Cercel, Mihaela-Claudia

RoLargeSum: A Large Dialect-Aware Romanian News Dataset for Summary, Headline, and Keyword Generation

arXiv.org Artificial IntelligenceDec-15-2024

Using supervised automatic summarisation methods requires sufficient corpora that include pairs of documents and their summaries. Similarly to many tasks in natural language processing, most of the datasets available for summarization are in English, posing challenges for developing summarization models in other languages. Thus, in this work, we introduce RoLargeSum, a novel large-scale summarization dataset for the Romanian language crawled from various publicly available news websites from Romania and the Republic of Moldova that were thoroughly cleaned to ensure a high-quality standard. RoLargeSum contains more than 615K news articles, together with their summaries, as well as their headlines, keywords, dialect, and other metadata that we found on the targeted websites. We further evaluated the performance of several BART variants and open-source large language models on RoLargeSum for benchmarking purposes. We manually evaluated the results of the best-performing system to gain insight into the potential pitfalls of this data set and future development.

large language model, machine learning, natural language, (20 more...)

2412.11317

Country:

Europe > Moldova (0.25)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Asia > Russia > Far Eastern Federal District > Sakha Republic (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Nair, Sindhu, Rao, Y. S., Shankarmani, Radha

Assessment of Transformer-Based Encoder-Decoder Model for Human-Like Summarization

arXiv.org Artificial IntelligenceOct-22-2024

In recent times, extracting valuable information from large text is making significant progress. Especially in the current era of social media, people expect quick bites of information. Automatic text summarization seeks to tackle this by slimming large texts down into more manageable summaries. This important research area can aid in decision-making by digging out salient content from large text. With the progress in deep learning models, significant work in language models has emerged. The encoder-decoder framework in deep learning has become the central approach for automatic text summarization. This work leverages transformer-based BART model for human-like summarization which is an open-ended problem with many challenges. On training and fine-tuning the encoder-decoder model, it is tested with diverse sample articles and the quality of summaries of diverse samples is assessed based on human evaluation parameters. Further, the finetuned model performance is compared with the baseline pretrained model based on evaluation metrics like ROUGE score and BERTScore. Additionally, domain adaptation of the model is required for improved performance of abstractive summarization of dialogues between interlocutors. On investigating, the above popular evaluation metrics are found to be insensitive to factual errors. Further investigation of the summaries generated by finetuned model is done using the contemporary evaluation metrics of factual consistency like WeCheck and SummaC. Empirical results on BBC News articles highlight that the gold standard summaries written by humans are more factually consistent by 17% than the abstractive summaries generated by finetuned model.

bertscore, rouge score and bertscore, summarization, (12 more...)

2410.16842

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > India > Maharashtra > Mumbai (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Souto, Hugo Gobato, Neto, Francisco Louzada

K-Fold Causal BART for CATE Estimation

arXiv.org Machine LearningSep-9-2024

This research aims to propose and evaluate a novel model named K-Fold Causal Bayesian Additive Regression Trees (K-Fold Causal BART) for improved estimation of Average Treatment Effects (ATE) and Conditional Average Treatment Effects (CATE). The study employs synthetic and semi-synthetic datasets, including the widely recognized Infant Health and Development Program (IHDP) benchmark dataset, to validate the model's performance. Despite promising results in synthetic scenarios, the IHDP dataset reveals that the proposed model is not state-of-the-art for ATE and CATE estimation. Nonetheless, the research provides several novel insights: 1. The ps-BART model is likely the preferred choice for CATE and ATE estimation due to better generalization compared to the other benchmark models - including the Bayesian Causal Forest (BCF) model, which is considered by many the current best model for CATE estimation, 2. The BCF model's performance deteriorates significantly with increasing treatment effect heterogeneity, while the ps-BART model remains robust, 3. Models tend to be overconfident in CATE uncertainty quantification when treatment effect heterogeneity is low, 4. A second K-Fold method is unnecessary for avoiding overfitting in CATE estimation, as it adds computational costs without improving performance, 5. Detailed analysis reveals the importance of understanding dataset characteristics and using nuanced evaluation methods, 6. The conclusion of Curth et al. (2021) that indirect strategies for CATE estimation are superior for the IHDP dataset is contradicted by the results of this research. These findings challenge existing assumptions and suggest directions for future research to enhance causal inference methodologies.

cate estimation, dataset, estimation, (13 more...)

2409.05665

Country:

North America > United States (0.46)
South America > Brazil > São Paulo (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry:

Government (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)

Priyadarsini, Indra, Sharma, Vidushi, Takeda, Seiji, Kishimoto, Akihiro, Hamada, Lisa, Shinohara, Hajime

Improving Performance Prediction of Electrolyte Formulations with Transformer-based Molecular Representation Model

arXiv.org Artificial IntelligenceJun-28-2024

Development of efficient and high-performing electrolytes is crucial for advancing energy storage technologies, particularly in batteries. Predicting the performance of battery electrolytes rely on complex interactions between the individual constituents. Consequently, a strategy that adeptly captures these relationships and forms a robust representation of the formulation is essential for integrating with machine learning models to predict properties accurately. In this paper, we introduce a novel approach leveraging a transformer-based molecular representation model to effectively and efficiently capture the representation of electrolyte formulations. The performance of the proposed approach is evaluated on two battery property prediction tasks and the results show superior performance compared to the state-of-the-art methods.

electrolyte formulation, formulation, representation, (9 more...)

2406.19792

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
South America > Brazil (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.34)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)